44 research outputs found

    Parallel architectures and runtime systems co-design for task-based programming models

    Get PDF
    The increasing parallelism levels in modern computing systems has extolled the need for a holistic vision when designing multiprocessor architectures taking in account the needs of the programming models and applications. Nowadays, system design consists of several layers on top of each other from the architecture up to the application software. Although this design allows to do a separation of concerns where it is possible to independently change layers due to a well-known interface between them, it is hampering future systems design as the Law of Moore reaches to an end. Current performance improvements on computer architecture are driven by the shrinkage of the transistor channel width, allowing faster and more power efficient chips to be made. However, technology is reaching physical limitations were the transistor size will not be able to be reduced furthermore and requires a change of paradigm in systems design. This thesis proposes to break this layered design, and advocates for a system where the architecture and the programming model runtime system are able to exchange information towards a common goal, improve performance and reduce power consumption. By making the architecture aware of runtime information such as a Task Dependency Graph (TDG) in the case of dataflow task-based programming models, it is possible to improve power consumption by exploiting the critical path of the graph. Moreover, the architecture can provide hardware support to create such a graph in order to reduce the runtime overheads and making possible the execution of fine-grained tasks to increase the available parallelism. Finally, the current status of inter-node communication primitives can be exposed to the runtime system in order to perform a more efficient communication scheduling, and also creates new opportunities of computation and communication overlap that were not possible before. An evaluation of the proposals introduced in this thesis is provided and a methodology to simulate and characterize the application behavior is also presented.El aumento del paralelismo proporcionado por los sistemas de cómputo modernos ha provocado la necesidad de una visión holística en el diseño de arquitecturas multiprocesador que tome en cuenta las necesidades de los modelos de programación y las aplicaciones. Hoy en día el diseño de los computadores consiste en diferentes capas de abstracción con una interfaz bien definida entre ellas. Las limitaciones de esta aproximación junto con el fin de la ley de Moore limitan el potencial de los futuros computadores. La mayoría de las mejoras actuales en el diseño de los computadores provienen fundamentalmente de la reducción del tamaño del canal del transistor, lo cual permite chips más rápidos y con un consumo eficiente sin apenas cambios fundamentales en el diseño de la arquitectura. Sin embargo, la tecnología actual está alcanzando limitaciones físicas donde no será posible reducir el tamaño de los transistores motivando así un cambio de paradigma en la construcción de los computadores. Esta tesis propone romper este diseño en capas y abogar por un sistema donde la arquitectura y el sistema de tiempo de ejecución del modelo de programación sean capaces de intercambiar información para alcanzar una meta común: La mejora del rendimiento y la reducción del consumo energético. Haciendo que la arquitectura sea consciente de la información disponible en el modelo de programación, como puede ser el grafo de dependencias entre tareas en los modelos de programación dataflow, es posible reducir el consumo energético explotando el camino critico del grafo. Además, la arquitectura puede proveer de soporte hardware para crear este grafo con el objetivo de reducir el overhead de construir este grado cuando la granularidad de las tareas es demasiado fina. Finalmente, el estado de las comunicaciones entre nodos puede ser expuesto al sistema de tiempo de ejecución para realizar una mejor planificación de las comunicaciones y creando nuevas oportunidades de solapamiento entre cómputo y comunicación que no eran posibles anteriormente. Esta tesis aporta una evaluación de todas estas propuestas, así como una metodología para simular y caracterizar el comportamiento de las aplicacionesPostprint (published version

    The Promotion of Graduate Programs through Clustering Prospective Students

    Get PDF
    The promotion of academic programs, particularly at graduate levels, emerges as a response to market changes. In general, graduate programs are not a first order necessity which makes necessary the right promotion of such programs guarantee the attraction of prospective students, which enroll in some of them, which is essential for the financial sustainability of universities. Notably, the last one is a crucial problem for private universities. In this paper, we analyze the prospective students that enroll in a private to design better promotion strategies by using on data gathered by online sources. Specifically, we use clustering techniques to define marketing strategies based on segments of students. We find that age and city are crucial to promoting graduate programs while marital status and sex does not impact the decision of students in the university that we analyze

    Assessing the Suitability of King Topologies for Interconnection Networks

    Get PDF
    In the late years many different interconnection networks have been used with two main tendencies. One is characterized by the use of high-degree routers with long wires while the other uses routers of much smaller degree. The latter rely on two-dimensional mesh and torus topologies with shorter local links. This paper focuses on doubling the degree of common 2D meshes and tori while still preserving an attractive layout for VLSI design. By adding a set of diagonal links in one direction, diagonal networks are obtained. By adding a second set of links, networks of degree eight are built, named king networks. This research presents a comprehensive study of these networks which includes a topological analysis, the proposal of appropriate routing procedures and an empirical evaluation. King networks exhibit a number of attractive characteristics which translate to reduced execution times of parallel applications. For example, the execution times NPB suite are reduced up to a 30 percent. In addition, this work reveals other properties of king networks such as perfect partitioning that deserves further attention for its convenient exploitation in forthcoming high-performance parallel systems

    Runtime-guided management of scratchpad memories in multicore architectures

    Get PDF
    © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.The increasing number of cores and the anticipated level of heterogeneity in upcoming multicore architectures cause important problems in traditional cache hierarchies. A good way to alleviate these problems is to add scratchpad memories alongside the cache hierarchy, forming a hybrid memory hierarchy. This memory organization has the potential to improve performance and to reduce the power consumption and the on-chip network traffic, but exposing such a complex memory model to the programmer has a very negative impact on the programmability of the architecture. Emerging task-based programming models are a promising alternative to program heterogeneous multicore architectures. In these models the runtime system manages the execution of the tasks on the architecture, allowing them to apply many optimizations in a generic way at the runtime system level. This paper proposes giving the runtime system the responsibility to manage the scratchpad memories of a hybrid memory hierarchy in multicore processors, transparently to the programmer. In the envisioned system, the runtime system takes advantage of the information found in the task dependences to map the inputs and outputs of a task to the scratchpad memory of the core that is going to execute it. In addition, the paper exploits two mechanisms to overlap the data transfers with computation and a locality-aware scheduler to reduce the data motion. In a 32-core multicore architecture, the hybrid memory hierarchy outperforms cache-only hierarchies by up to 16%, reduces on-chip network traffic by up to 31% and saves up to 22% of the consumed power.Peer ReviewedPostprint (author's final draft

    Runtime-aware architectures

    Get PDF
    In the last few years, the traditional ways to keep the increase of hardware performance to the rate predicted by the Moore’s Law have vanished. When uni-cores were the norm, hardware design was decoupled from the software stack thanks to a well defined Instruction Set Architecture (ISA). This simple interface allowed developing applications without worrying too much about the underlying hardware, while hardware designers were able to aggressively exploit instruction-level parallelism (ILP) in superscalar processors. Current multi-cores are designed as simple symmetric multiprocessors (SMP) on a chip. However, we believe that this is not enough to overcome all the problems that multi-cores face. The runtime system of the parallel programming model has to drive the design of future multi-cores to overcome the restrictions in terms of power, memory, programmability and resilience that multi-cores have. In the paper, we introduce an approach towards a Runtime-Aware Architecture (RAA), a massively parallel architecture designed from the runtime’s perspective.This work has been partially supported by the European Research Council under the European Union’s 7th FP, ERC Grant Agreement number 321253, by the Spanish Ministry of Science and Innovation under grant TIN2012-34557 and by the HiPEAC Network of Excellence. M. Moreto has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship number JCI- 2012-15047, and M. Casas is supported by the Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Government of Catalonia and the Co-fund programme of the Marie Curie Actions of the 7th R&D Framework Programme of the European Union (Contract 2013 BP B 00243).Peer ReviewedPostprint (author's final draft

    The evolution of the ventilatory ratio is a prognostic factor in mechanically ventilated COVID-19 ARDS patients

    Get PDF
    Background: Mortality due to COVID-19 is high, especially in patients requiring mechanical ventilation. The purpose of the study is to investigate associations between mortality and variables measured during the first three days of mechanical ventilation in patients with COVID-19 intubated at ICU admission. Methods: Multicenter, observational, cohort study includes consecutive patients with COVID-19 admitted to 44 Spanish ICUs between February 25 and July 31, 2020, who required intubation at ICU admission and mechanical ventilation for more than three days. We collected demographic and clinical data prior to admission; information about clinical evolution at days 1 and 3 of mechanical ventilation; and outcomes. Results: Of the 2,095 patients with COVID-19 admitted to the ICU, 1,118 (53.3%) were intubated at day 1 and remained under mechanical ventilation at day three. From days 1 to 3, PaO2/FiO2 increased from 115.6 [80.0-171.2] to 180.0 [135.4-227.9] mmHg and the ventilatory ratio from 1.73 [1.33-2.25] to 1.96 [1.61-2.40]. In-hospital mortality was 38.7%. A higher increase between ICU admission and day 3 in the ventilatory ratio (OR 1.04 [CI 1.01-1.07], p = 0.030) and creatinine levels (OR 1.05 [CI 1.01-1.09], p = 0.005) and a lower increase in platelet counts (OR 0.96 [CI 0.93-1.00], p = 0.037) were independently associated with a higher risk of death. No association between mortality and the PaO2/FiO2 variation was observed (OR 0.99 [CI 0.95 to 1.02], p = 0.47). Conclusions: Higher ventilatory ratio and its increase at day 3 is associated with mortality in patients with COVID-19 receiving mechanical ventilation at ICU admission. No association was found in the PaO2/FiO2 variation

    Search for dark matter produced in association with bottom or top quarks in √s = 13 TeV pp collisions with the ATLAS detector

    Get PDF
    A search for weakly interacting massive particle dark matter produced in association with bottom or top quarks is presented. Final states containing third-generation quarks and miss- ing transverse momentum are considered. The analysis uses 36.1 fb−1 of proton–proton collision data recorded by the ATLAS experiment at √s = 13 TeV in 2015 and 2016. No significant excess of events above the estimated backgrounds is observed. The results are in- terpreted in the framework of simplified models of spin-0 dark-matter mediators. For colour- neutral spin-0 mediators produced in association with top quarks and decaying into a pair of dark-matter particles, mediator masses below 50 GeV are excluded assuming a dark-matter candidate mass of 1 GeV and unitary couplings. For scalar and pseudoscalar mediators produced in association with bottom quarks, the search sets limits on the production cross- section of 300 times the predicted rate for mediators with masses between 10 and 50 GeV and assuming a dark-matter mass of 1 GeV and unitary coupling. Constraints on colour- charged scalar simplified models are also presented. Assuming a dark-matter particle mass of 35 GeV, mediator particles with mass below 1.1 TeV are excluded for couplings yielding a dark-matter relic density consistent with measurements

    Guidelines for the use and interpretation of assays for monitoring autophagy (3rd edition)

    Get PDF
    In 2008 we published the first set of guidelines for standardizing research in autophagy. Since then, research on this topic has continued to accelerate, and many new scientists have entered the field. Our knowledge base and relevant new technologies have also been expanding. Accordingly, it is important to update these guidelines for monitoring autophagy in different organisms. Various reviews have described the range of assays that have been used for this purpose. Nevertheless, there continues to be confusion regarding acceptable methods to measure autophagy, especially in multicellular eukaryotes. For example, a key point that needs to be emphasized is that there is a difference between measurements that monitor the numbers or volume of autophagic elements (e.g., autophagosomes or autolysosomes) at any stage of the autophagic process versus those that measure fl ux through the autophagy pathway (i.e., the complete process including the amount and rate of cargo sequestered and degraded). In particular, a block in macroautophagy that results in autophagosome accumulation must be differentiated from stimuli that increase autophagic activity, defi ned as increased autophagy induction coupled with increased delivery to, and degradation within, lysosomes (inmost higher eukaryotes and some protists such as Dictyostelium ) or the vacuole (in plants and fungi). In other words, it is especially important that investigators new to the fi eld understand that the appearance of more autophagosomes does not necessarily equate with more autophagy. In fact, in many cases, autophagosomes accumulate because of a block in trafficking to lysosomes without a concomitant change in autophagosome biogenesis, whereas an increase in autolysosomes may reflect a reduction in degradative activity. It is worth emphasizing here that lysosomal digestion is a stage of autophagy and evaluating its competence is a crucial part of the evaluation of autophagic flux, or complete autophagy. Here, we present a set of guidelines for the selection and interpretation of methods for use by investigators who aim to examine macroautophagy and related processes, as well as for reviewers who need to provide realistic and reasonable critiques of papers that are focused on these processes. These guidelines are not meant to be a formulaic set of rules, because the appropriate assays depend in part on the question being asked and the system being used. In addition, we emphasize that no individual assay is guaranteed to be the most appropriate one in every situation, and we strongly recommend the use of multiple assays to monitor autophagy. Along these lines, because of the potential for pleiotropic effects due to blocking autophagy through genetic manipulation it is imperative to delete or knock down more than one autophagy-related gene. In addition, some individual Atg proteins, or groups of proteins, are involved in other cellular pathways so not all Atg proteins can be used as a specific marker for an autophagic process. In these guidelines, we consider these various methods of assessing autophagy and what information can, or cannot, be obtained from them. Finally, by discussing the merits and limits of particular autophagy assays, we hope to encourage technical innovation in the field

    Search for single production of vector-like quarks decaying into Wb in pp collisions at s=8\sqrt{s} = 8 TeV with the ATLAS detector

    Get PDF

    Measurement of the charge asymmetry in top-quark pair production in the lepton-plus-jets final state in pp collision data at s=8TeV\sqrt{s}=8\,\mathrm TeV{} with the ATLAS detector

    Get PDF
    corecore